Probabilistic Models for Detecting Census Person Duplication

Author

  • Robert E. Fay
Abstract

The net undercount of the population by the decennial census arises from the balance between: (1) omissions of persons the census should count but misses, and (2) erroneous enumerations the census incorrectly includes. Duplication is a form of erroneous enumeration; typically a duplicated person is counted correctly where they should be but also incorrectly elsewhere. Coverage measurement surveys, such as the 2000 Accuracy and Coverage Evaluation (A.C.E.), must account for the effect of census duplication in order to measure the net undercount accurately. Because Census 2000 captured both names and dates of birth for most respondents, computer matching can identify possible duplicate enumerations. Exact matches on name and date of birth appear to be likely evidence of duplication, but such matches also include distinct persons with the same name who coincidentally share a birthday. The paper develops probabilistic expressions that describe, among exact matches, the relative effects of duplication and of coincidental sharing of a birthday.
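
To make the coincidence effect concrete, the following is a minimal sketch and not the paper's own model. It assumes that dates of birth are independent and uniformly distributed over a fixed number of possible dates, which real populations only approximate. The function names and the value 36525 (roughly 100 years of possible birth dates) are hypothetical choices made for illustration.

```python
from math import comb


def expected_coincidental_pairs(n_same_name: int, n_birthdates: int) -> float:
    """Expected number of distinct-person pairs that share a date of birth
    purely by chance, among n_same_name people with an identical name.

    Assumption (not from the paper): dates of birth are independent and
    uniform over n_birthdates possible dates.
    """
    # Each of the C(n, 2) pairs matches with probability 1 / n_birthdates.
    return comb(n_same_name, 2) / n_birthdates


def prob_any_coincidence(n_same_name: int, n_birthdates: int) -> float:
    """Probability that at least one pair of distinct people with the same
    name shares a date of birth (the classic birthday problem)."""
    if n_same_name > n_birthdates:
        return 1.0  # pigeonhole: a shared date is certain
    p_none = 1.0
    for k in range(n_same_name):
        p_none *= (n_birthdates - k) / n_birthdates
    return 1.0 - p_none


if __name__ == "__main__":
    possible_dates = 36525  # hypothetical: about 100 years of birth dates
    for n in (10, 100, 1000):
        pairs = expected_coincidental_pairs(n, possible_dates)
        p_any = prob_any_coincidence(n, possible_dates)
        print(f"n={n}: expected chance pairs={pairs:.4f}, P(any)={p_any:.4f}")
```

Under these assumptions the expected number of chance matches grows roughly with the square of the number of people sharing a name, so exact matches among very common names are much more likely to be coincidental than exact matches among rare names.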

Similar resources

P2P Decentralized Population Census

We describe a framework and investigate techniques for running decentralized census processes that enable observers to independently verify governmental data. Census is a process impacting important issues such as the amount of funding that a community will get from a central government and its representation in the Congress. Correct census is essential for detecting vote stuffing. Census has b...

Reputation System for Decentralized Population Census

We describe a framework and techniques for running decentralized census processes that enable observers to independently verify governmental data. Census is a process impacting important issues such as the representation of a community in the Congress and the amount of funding that it gets from a central government. Correct census is essential for detecting vote stuffing. Reliable census data c...

Implicational Scaling of Reading Comprehension Construct: Is it Deterministic or Probabilistic?

In English as a Second Language teaching and testing situations, it is common to draw inferences about a learner's reading ability from his or her total score on a reading test. This assumes that reading items are unidimensional and reproducible. However, few studies have probed the issue through psychometric analyses. In the present study, the IELTS exemplar module C (1994) wa...

Using Administrative Record Persons in the 1996 Community Census

I. Introduction. Information from administrative records is a possible source of people and housing units missed in the Census. This paper documents the evaluation of the administrative records person level data used in the ... Medicare Enrollment Database from Health Care Financing Administration (HCFA), and the Registration File from Selective Service. In addition to the files above, for Chicago ...

Replica procedure for probabilistic algorithms as a model of gene duplication

In the present paper we propose to describe gene networks in biological systems using probabilistic algorithms. We describe gene duplication in the process of biological evolution by introducing a replica procedure for probabilistic algorithms. We construct examples of such a replica procedure for hidden Markov models. We introduce the family of hidden Markov models where the set o...

Publication date: 2002